-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Detect and throw error when model output or model metadata files are deleted/modified. #63
Conversation
Demo of functionality including modification & deletion of model output files + deletion of model metadata file in a PR library(hubValidations)
temp_hub <- fs::path(tempdir(), "mod_del_hub")
gert::git_clone(
url = "https://github.com/Infectious-Disease-Modeling-Hubs/ci-testhub-simple",
path = temp_hub,
branch = "test-mod-del"
)
validate_pr(
hub_path = temp_hub,
gh_repo = "Infectious-Disease-Modeling-Hubs/ci-testhub-simple",
pr_number = 6,
skip_submit_window_check = TRUE
)
#> ℹ PR contains commits to additional files which have not been checked:
#> • ".github/workflows/validate_submission.yaml"
#> • "README.md"
#> • "model-metadata/README.md"
#> • "model-output/hub-baseline/README.txt"
#> • "random-file.txt"
#> ✔ mod_del_hub: All hub config files are valid.
#> ✖ 2022-10-08-hub-baseline.csv: Previously submitted model output files must not
#> be modified. 'model-output/hub-baseline/2022-10-08-hub-baseline.csv'
#> modified.
#> ✖ 2022-10-15-team1-goodmodel.csv: Previously submitted model output files must
#> not be removed.
#> 'model-output/team1-goodmodel/2022-10-15-team1-goodmodel.csv' removed.
#> ✖ team1-goodmodel.yaml: Previously submitted model metadata files must not be
#> removed. 'model-metadata/team1-goodmodel.yaml' removed.
#> ✔ 2022-10-08-hub-baseline.csv: File exists at path
#> 'model-output/hub-baseline/2022-10-08-hub-baseline.csv'.
#> ✔ 2022-10-08-hub-baseline.csv: File name "2022-10-08-hub-baseline.csv" is
#> valid.
#> ✔ 2022-10-08-hub-baseline.csv: File directory name matches `model_id` metadata
#> in file name.
#> ✔ 2022-10-08-hub-baseline.csv: `round_id` is valid.
#> ✔ 2022-10-08-hub-baseline.csv: File is accepted hub format.
#> ✔ 2022-10-08-hub-baseline.csv: Metadata file exists at path
#> 'model-metadata/hub-baseline.yml'.
#> ✔ 2022-10-08-hub-baseline.csv: File could be read successfully.
#> ✔ 2022-10-08-hub-baseline.csv: `round_id_col` name is valid.
#> ✔ 2022-10-08-hub-baseline.csv: `round_id` column "origin_date" contains a
#> single, unique round ID value.
#> ✔ 2022-10-08-hub-baseline.csv: All `round_id_col` "origin_date" values match
#> submission `round_id` from file name.
#> ✔ 2022-10-08-hub-baseline.csv: Column names are consistent with expected round
#> task IDs and std column names.
#> ✔ 2022-10-08-hub-baseline.csv: Column data types match hub schema.
#> ✔ 2022-10-08-hub-baseline.csv: `tbl` contains valid values/value combinations.
#> ✔ 2022-10-08-hub-baseline.csv: All combinations of task ID
#> column/`output_type`/`output_type_id` values are unique.
#> ✔ 2022-10-08-hub-baseline.csv: Required task ID/output type/output type ID
#> combinations all present.
#> ✔ 2022-10-08-hub-baseline.csv: Values in column `value` all valid with respect
#> to modeling task config.
#> ✔ 2022-10-08-hub-baseline.csv: Values in `value` column are non-decreasing as
#> output_type_ids increase for all unique task ID value/output type
#> combinations of quantile or cdf output types.
#> ℹ 2022-10-08-hub-baseline.csv: No pmf output types to check for sum of 1. Check
#> skipped.
#> ✔ 2022-10-22-team1-goodmodel.csv: File exists at path
#> 'model-output/team1-goodmodel/2022-10-22-team1-goodmodel.csv'.
#> ✔ 2022-10-22-team1-goodmodel.csv: File name "2022-10-22-team1-goodmodel.csv" is
#> valid.
#> ✔ 2022-10-22-team1-goodmodel.csv: File directory name matches `model_id`
#> metadata in file name.
#> ✔ 2022-10-22-team1-goodmodel.csv: `round_id` is valid.
#> ✔ 2022-10-22-team1-goodmodel.csv: File is accepted hub format.
#> ✖ 2022-10-22-team1-goodmodel.csv: Metadata file does not exist at path
#> 'model-metadata/team1-goodmodel.yml' or
#> 'model-metadata/team1-goodmodel.yaml'. Created on 2023-11-28 with reprex v2.0.2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After running the code locally, I have some questions. Some of the tests are falling locally might be because of a version of the rlang package maybe:
── Failure ('test-validate_model_file.R:58:3'): validate_model_file print method work [fancy] ──
Snapshot of code has changed:
old[1:6] vs new[1:6]
Code
validate_model_file(hub_path, file_path = "team1-goodmodel/2022-10-15-hub-baseline.csv")
- Message
+ Message <rlang_message>
✔ 2022-10-15-hub-baseline.csv: File exists at path model-output/team1-goodmodel/2022-10-15-hub-baseline.csv.
✔ 2022-10-15-hub-baseline.csv: File name "2022-10-15-hub-baseline.csv" is valid.
! 2022-10-15-hub-baseline.csv: File directory name must match `model_id` metadata in file name. File should be submitted to directory "hub-baseline" not "team1-goodmodel"
* Run `testthat::snapshot_accept('validate_model_file')` to accept the change.
* Run `testthat::snapshot_review('validate_model_file')` to interactively review the change.
I don't think it blocks the PR but wanted to warn you!
@@ -1,3 +1,7 @@ | |||
# hubValidations 0.0.0.9006 | |||
|
|||
* `validate_pr()` now checks for deletions of previously submitted model metadata files and modifications or deletions of previously submitted model output files, adding an `<error/check_error>` class object to the function output for each detected modified/deleted file (#43 & #44). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some thinking, I wonder if it makes sense to have the possibility (maybe in a future version) to set this check to returns a warning instead of an error. I am particularly thinking about the Scenario projection rounds, where during an ongoing round, a team might want to submit an update of their projections.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So a warning would still get flagged and fail the workflow. The error just made it a bit more visually obvious when printing the results.
Do you want it to be able to pass validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, I guess if it's in the ongoing round, it will make sense for us to have it pass the validation, can it be a parameter the hub set? so that depending on the hub they can decide if they want this type of issue to fail or pass the validation (with just a warning/message)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should be something we can introduce indeed. Fancy adding an issue with how you envisage it best working for you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes will do that! thanks
Indeed it is to do with the version of |
Actually it's not the See for example this commit I had to make to update broken tests due to new |
Thanks, I miss the version information on the DESCRIPTION file, sorry about that! Updating |
Yeyyy 🚀 Thanks again for your review @LucieContamin ! |
PR adds functionality to
validate_pr()
to check for deletions of previously submitted model metadata files and modifications or deletions of previously submitted model output files, adding an<error/check_error>
class object to the function output for each detected modified/deleted fileTesting uses the following PR: hubverse-org/ci-testhub-simple#6